2 Genotype, Phenotype, and Environment

Fig. 2.1 The relation among genes, mRNA, proteins, and metabolites. The curved arrows in the upper half of the diagram denote regulatory processes

Table 2.1 Approximate numbers (variety) of different objects in the human body

Object                    Number
Genes                     30 000
mRNA                      10^5
Proteins (a)              3 × 10^5
Expressed proteins (b)    10^3–10^4
Cell types                220
Cells (c)                 10^13–10^14

(a) Potential repertoire
(b) In a given cell type
(c) Excluding microbial cells hosted within the body and which may be comparably numerous

The bioinformatics landscape was dramatically transformed by the availability
of whole genomes and, at roughly the same time (although there was no especial
connexion between the developments), whole proteomes and whole metabolomes.
Far wider-ranging comparisons could now be carried out; in particular, a global vision
of regulation seemed to be within grasp. Part III focuses on these developments;
Table 2.1 recalls the magnitude, at the level of the raw materials, of the problems to
be solved.
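
To give a feel for the magnitudes in Table 2.1, the following minimal Python sketch derives a few ratios and pair counts from the tabulated values. The constants are copied from the table and are order-of-magnitude estimates only; the names (GENES, PROTEINS, EXPRESSED_PROTEINS, pairwise_comparisons, table_summary) are hypothetical and chosen purely for illustration. Counting unordered pairs is just one simple way of gauging how quickly all-against-all comparison grows, not a method prescribed by the text.

```python
from math import comb

# Approximate counts copied from Table 2.1 (orders of magnitude only).
GENES = 30_000
PROTEINS = 3 * 10**5                 # potential repertoire (footnote a)
EXPRESSED_PROTEINS = (10**3, 10**4)  # range in a given cell type (footnote b)


def pairwise_comparisons(n: int) -> int:
    """Number of unordered pairs among n objects, n(n - 1)/2."""
    return comb(n, 2)


def table_summary() -> dict:
    """Derive a few ratios and pair counts from the Table 2.1 figures."""
    low, high = EXPRESSED_PROTEINS
    return {
        # Average number of distinct proteins per gene.
        "proteins_per_gene": PROTEINS / GENES,
        # Fraction of the potential repertoire expressed in one cell type.
        "expressed_fraction": (low / PROTEINS, high / PROTEINS),
        # Size of an all-against-all comparison of genes or proteins.
        "gene_pairs": pairwise_comparisons(GENES),
        "protein_pairs": pairwise_comparisons(PROTEINS),
    }


if __name__ == "__main__":
    for name, value in table_summary().items():
        print(f"{name}: {value}")
```

On these figures there are roughly ten potential proteins per gene, a given cell type expresses between about 0.3% and 3% of the potential repertoire, and an all-against-all comparison of genes alone already involves about 4.5 × 10^8 pairs.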

Genomics is concerned with the analysis of gene sequences, and there are two
main territories of this work: (1) comparison of gene sequences, that is, analysis of
the relation of a given sequence with other sequences (external correlations); and
(2) analysis of the succession of symbols in sequences (internal correlations). The
first attempts to elucidate the function of sequences whose function is unknown
compared the “unknown” sequence with sequences of known function. This approach
is based on the principles that similar sequences encode similar protein structures,
and that similar structures confer similar functions (there are, however, many examples
for which these principles do not hold). One also compares sequences known to